Improved disk-drive failure warnings

Authors

  • Gordon F. Hughes
  • Joseph F. Murray
  • Kenneth Kreutz-Delgado
  • Charles Elkan
Abstract

Improved methods are proposed for disk-drive failure prediction. The SMART (Self Monitoring and Reporting Technology) failure prediction system is currently implemented in disk drives. Its purpose is to predict the near-term failure of an individual hard disk drive and to issue a backup warning to prevent data loss. Two experimental tests of SMART show only moderate accuracy at low false-alarm rates. (A false-alarm rate of 0.2% of total drives per year implies that 20% of drive returns would be good drives, relative to a 1% annual failure rate of drives.) This requirement for very low false-alarm rates is well known in medical diagnostic tests for rare diseases, and the methodology used there suggests ways to improve SMART. Two improved SMART algorithms are proposed. They use the SMART internal drive attribute measurements available in present drives. The present warning algorithm, based on maximum error thresholds, is replaced by distribution-free statistical hypothesis tests. These improved algorithms are computationally simple enough to be implemented in drive microprocessor firmware code. They require only integer sort operations to put several hundred attribute values in rank order. Some tens of these ranks are added up, and the SMART warning is issued if the sum exceeds a prestored limit. These new algorithms were tested on 3744 drives of 2 models. They gave 3–4 times higher correct prediction accuracy than error thresholds on will-fail drives, at a 0.2% false-alarm rate. The highest accuracies achievable are modest (40%–60%). Care was taken to test will-fail drive prediction accuracy on data independent of the algorithm design data. Additional work is needed to verify and apply these algorithms in actual drive design. They can also be useful in drive failure analysis engineering. It might be possible to screen drives in manufacturing using SMART attributes. Marginal drives might be detected before substantial final test time is invested in them, thereby decreasing manufacturing cost and possibly decreasing overall field failure rates.
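The rank-and-sum step described in the abstract can be illustrated with a small sketch. This is not the authors' firmware code. It assumes a Wilcoxon-style rank-sum test in which a reference set of attribute values (hypothetically taken from known-good drives) is pooled with a few recent values from the drive under test. The names `rank_sum`, `smart_warning`, `reference`, `recent`, and `limit` are illustrative only. Tied values receive the average of their ranks, which uses floating point here for readability; an integer-only variant could work with doubled ranks instead.

```python
from typing import Sequence


def rank_sum(reference: Sequence[int], recent: Sequence[int]) -> float:
    """Return the sum of the ranks that `recent` values take in the pooled sample.

    Ranks start at 1 for the smallest value; tied values share their average rank.
    """
    # Tag each value with 1 if it comes from the drive under test, else 0.
    pooled = [(v, 0) for v in reference] + [(v, 1) for v in recent]
    pooled.sort(key=lambda pair: pair[0])

    total = 0.0
    i = 0
    n = len(pooled)
    while i < n:
        # Find the end of the run of equal values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # 0-based positions i..j-1 hold ranks i+1..j; their mean is (i + 1 + j) / 2.
        avg_rank = (i + 1 + j) / 2
        # Add that rank once for every "recent" value inside the tied run.
        total += avg_rank * sum(flag for _, flag in pooled[i:j])
        i = j
    return total


def smart_warning(reference: Sequence[int], recent: Sequence[int], limit: float) -> bool:
    """Issue a warning when the rank sum exceeds a prestored limit (assumed given)."""
    return rank_sum(reference, recent) > limit
```

In this sketch, `limit` stands in for the prestored limit mentioned in the abstract. How that limit is chosen is not shown here; presumably it would be set offline so that good drives trigger the warning only at the target false-alarm rate.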

Similar resources

Failure Trends in a Large Disk Drive Population

It is estimated that over 90% of all new information produced in the world is being stored on magnetic media, most of it on hard disk drives. Despite their importance, there is relatively little published work on the failure patterns of disk drives, and the key factors that affect their lifetime. Most available data are either based on extrapolation from accelerated aging experiments or from re...

Bayesian approaches to failure prediction for disk drives

Hard disk drive failures are rare but are often costly. The ability to predict failures is important to consumers, drive manufacturers, and computer system manufacturers alike. In this paper we investigate the abilities of two Bayesian methods to predict disk drive failures based on measurements of drive internal conditions. We first view the problem from an anomaly detection stance. We introdu...

Disk Drive Failure Factors

Modern disk drives have become highly complex with increasing demands on high throughput, high reliability, high capacity, and low cost most of which are typically mutually exclusive. This paper takes a look at the reliability aspect of disk drives, and, more specifically, factors that cause a drive to fail. Failures fall into three main categories: hardware, software, and process. Many of the ...

Fretting Wear between a Hollow Sphere and Flat Surface

Wear particles in a hard disk drive may cause the head/disk interface to fail. We have experimentally investigated wear particle generation resulting from fretting wear between the dimple on the suspension and the gimbal spring. We have found that increasing the normal load as well as using a low friction coating reduces the formation of wear particles. INTRODUCTION Examination of failed hard d...

Dynamic characterisation of the head-media interface in hard disk drives using novel sensor systems

Hard disk drives function perfectly satisfactorily when used in a stable environment, but in certain applications they are subjected to shock and vibration. During the work reported in this thesis it has been found that when typical hard disk drives are subjected to vibration, data transfer failure is found to be significant at frequencies between 440 Hz and 700 Hz, at an extreme, failing at only...

Journal:
  • IEEE Trans. Reliability

Volume 51

Publication date: 2002